PRODUCTION AND USE OF TYPE 5 17BETA-HYDROXYSTEROID DEHYDROGENASE

BACKGROUND OF THE INVENTION 
Field of the Invention 

The present invention relates to the isolation and characterization of a novel
enzyme which is implicated in the production of sex steroids, and more particularly,
to the characterization of the gene and cDNA of a novel 20α, 17β-hydroxysteroid
dehydrogenase (hereinafter type 5 17β-HSD) which has been implicated in the
conversion of progesterone and 4-androstenedione (Δ⁴-dione) to
20α-hydroxyprogesterone (20α-OH-P) and testosterone (T), respectively. The use of
this enzyme in an assay for inhibitors of the enzyme is also described, as are several
other uses of the DNA, fragments thereof and antisense fragments thereof.

Description of the Related Art

The enzymes identified as 17β-HSDs are important in the production of human
sex steroids, including androst-5-ene-3β,17β-diol (Δ⁵-diol), testosterone and estradiol.
It was once thought that a single gene encoded a single type of 17β-HSD which was
responsible for catalyzing all of the reactions. However, in humans, several types of
17β-HSD have now been identified and characterized. Each type of 17β-HSD has
been found to catalyze specific reactions and to be located in specific tissues. Further
information about Types 1, 2 and 3 17β-HSD can be had by reference as follows:
Type 1 17β-HSD is described by Luu-The, V. et al., Mol. Endocrinol., 3:1301-1309
(1989) and by Peltoketo, H. et al., FEBS Lett., 239:73-77 (1988); Type 2 17β-HSD is
described in Wu, L. et al., J. Biol. Chem., 268:12964-12969 (1993); Type 3 17β-HSD
is described in Geissler, W.M., Nature Genetics, 7:34-39 (1994). A fourth type which
is homologous to porcine ovarian 17β-HSD has recently been identified by researchers
Adamski and de Launoit; however, applicant is not presently aware of published
information on this type.

The present invention relates to a fifth type of 17β-HSD which is described in
detail below.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a novel 17β-hydroxysteroid
dehydrogenase (17β-HSD) which is identified as type 5 17β-HSD.

It is also an object of the present invention to provide a 17β-HSD which has
been shown to be involved in the conversion of Δ⁴-dione to testosterone and in the
conversion of progesterone to 20α-hydroxyprogesterone (20α-OH-P).

It is a further object of this invention to provide the nucleotide sequences and a
gene map for type 5 17β-HSD.

It is also an object of this invention to provide methods of using type 5
17β-HSD in an assay to identify compounds which inhibit the activity of this enzyme,
and thus may reduce production of testosterone or 20α-hydroxyprogesterone, and can
be used to treat medical conditions which respond unfavorably to these steroids.

It is an additional object of this invention to provide methods of preventing the
synthesis of type 5 17β-HSD by administering an antisense RNA of the gene sequence
of type 5 17β-HSD to interfere with the translation of the gene's mRNA.

These and other objects are discussed herein.

In particular, a novel enzyme, type 5 17β-hydroxysteroid dehydrogenase, has
been identified and characterized. The gene sequence for this type 5 17β-HSD was
found to encode a protein of 323 amino acids, having an apparent calculated molecular
weight of 36,844 daltons. The protein is encoded by nucleotides +11 through 982,
including the stop codon (amino acids +1 through 323), numbered in the 5' to 3'
direction, in the following sequence (SEQ ID Nos. 1 and 2):

GTGACAGGGA ATG GAT TCC AAA CAG CAG TGT GTA AAG CTA AAT GAT GGC   49
           Met Asp Ser Lys Gln Gln Cys Val Lys Leu Asn Asp Gly
           1               5                   10

CAC TTC ATG CCT GTA TTG GGA TTT GGC ACC TAT GCA CCT CCA GAG GTT   97
His Phe Met Pro Val Leu Gly Phe Gly Thr Tyr Ala Pro Pro Glu Val
    15                  20                  25

CCG AGA AGT AAA GCT TTG GAG GTC ACC AAA TTA GCA ATA GAA GCT GGG   145
Pro Arg Ser Lys Ala Leu Glu Val Thr Lys Leu Ala Ile Glu Ala Gly
30                  35                  40                  45

TTC CGC CAT ATA GAT TCT GCT CAT TTA TAC AAT AAT GAG GAG CAG GTT   193
Phe Arg His Ile Asp Ser Ala His Leu Tyr Asn Asn Glu Glu Gln Val
                50                  55                  60

GGA CTG GCC ATC CGA AGC AAG ATT GCA GAT GGC AGT GTG AAG AGA GAA   241
Gly Leu Ala Ile Arg Ser Lys Ile Ala Asp Gly Ser Val Lys Arg Glu
            65                  70                  75

GAC ATA TTC TAC ACT TCA AAG CTT TGG TCC ACT TTT CAT CGA CCA GAG   289
Asp Ile Phe Tyr Thr Ser Lys Leu Trp Ser Thr Phe His Arg Pro Glu
        80                  85                  90

TTG GTC CGA CCA GCC TTG GAA AAC TCA CTG AAA AAA GCT CAA TTG GAC   337
Leu Val Arg Pro Ala Leu Glu Asn Ser Leu Lys Lys Ala Gln Leu Asp
    95                  100                 105

TAT GTT GAC CTC TAT CTT ATT CAT TCT CCA ATG TCT CTA AAG CCA GGT   385
Tyr Val Asp Leu Tyr Leu Ile His Ser Pro Met Ser Leu Lys Pro Gly
110                 115                 120                 125

GAG GAA CTT TCA CCA ACA GAT GAA AAT GGA AAA GTA ATA TTT GAC ATA   433
Glu Glu Leu Ser Pro Thr Asp Glu Asn Gly Lys Val Ile Phe Asp Ile
                130                 135                 140

GTG GAT CTC TGT ACC ACC TGG GAG GCC ATG GAG AAG TGT AAG GAT GCA   481
Val Asp Leu Cys Thr Thr Trp Glu Ala Met Glu Lys Cys Lys Asp Ala
            145                 150                 155

GGA TTG GCC AAG TCC ATT GGG GTG TCA AAC TTC AAC CGC AGG CAG CTG   529
Gly Leu Ala Lys Ser Ile Gly Val Ser Asn Phe Asn Arg Arg Gln Leu
        160                 165                 170

GAG ATG ATC CTC AAC AAG CCA GGA CTC AAG TAC AAG CCT GTC TGC AAC   577
Glu Met Ile Leu Asn Lys Pro Gly Leu Lys Tyr Lys Pro Val Cys Asn
    175                 180                 185

CAG GTA GAA TGT CAT CCG TAT TTC AAC CGG AGT AAA TTG CTA GAT TTC   625
Gln Val Glu Cys His Pro Tyr Phe Asn Arg Ser Lys Leu Leu Asp Phe
190                 195                 200                 205

TGC AAG TCG AAA GAT ATT GTT CTG GTT GCC TAT AGT GCT CTG GGA TCT   673
Cys Lys Ser Lys Asp Ile Val Leu Val Ala Tyr Ser Ala Leu Gly Ser
                210                 215                 220

CAA CGA GAC AAA CGA TGG GTG GAC CCG AAC TCC CCG GTG CTC TTG GAG   721
Gln Arg Asp Lys Arg Trp Val Asp Pro Asn Ser Pro Val Leu Leu Glu
            225                 230                 235

GAC CCA GTC CTT TGT GCC TTG GCA AAA AAG CAC AAG CGA ACC CCA GCC   769
Asp Pro Val Leu Cys Ala Leu Ala Lys Lys His Lys Arg Thr Pro Ala
        240                 245                 250

CTG ATT GCC CTG CGC TAC CAG CTG CAG CGT GGG GTT GTG GTC CTG GCC   817
Leu Ile Ala Leu Arg Tyr Gln Leu Gln Arg Gly Val Val Val Leu Ala
    255                 260                 265

AAG AGC TAC AAT GAG CAG CGC ATC AGA CAG AAC GTG CAG GTT TTT GAG   865
Lys Ser Tyr Asn Glu Gln Arg Ile Arg Gln Asn Val Gln Val Phe Glu
270                 275                 280                 285

TTC CAG TTG ACT GCA GAG GAC ATG AAA GCC ATA GAT GGC CTA GAC AGA   913
Phe Gln Leu Thr Ala Glu Asp Met Lys Ala Ile Asp Gly Leu Asp Arg
                290                 295                 300

AAT CTC CAC TAT TTT AAC AGT GAT AGT TTT GCT AX  CAC CCT AAT TAT   961
Asn Leu His Tyr Phe Asn Ser Asp Ser Phe Ala Ser His Pro Asn Tyr
            305                 310                 315

CCA TAT TCA GAT GAA TAT TAA CATGGAGACT TTGCCTGATG ATGTCTACCA   1012
Pro Tyr Ser Asp Glu Tyr
        320

GAAGGCCCTG TGTGTGGATG GTGACGCAGA GGACGTCTCT ATGCCGGTGA CTGGACATAT   1072
CACCTCTACT TAAATCCGTC CTGTTTAGCG ACTTCAGTCA ACTACAGCTC ACTCCATAGG   1132
CCAGAAATAC AATAAATCCT GTTTAGCGAC TTCAGTCAAC TACAGCTCAC TCCATAGGCC   1192
AGAAATACAA TAAA   1206

In addition, a complete gene map (Figure 5) and nucleotide sequences (SEQ ID Nos.
3 through 29 and Figures 6A and 6B) of the chromosomal DNA of type 5 17β-HSD
are provided. A more detailed description of the sequences will be provided infra.

The present invention includes methods for the synthetic production of type 5
17β-HSD, as well as peptides that are biologically functionally equivalent, and
methods of using these compounds to screen test compounds for their ability to inhibit
or alter the enzymatic function. In addition, methods of producing antisense
constructs to the type 5 17β-HSD gene's DNA or mRNA or portions thereof, and the
use of those antisense constructs to interfere with the transcription or translation of the
enzyme, are also provided.

The nucleotide sequence which encodes type 5 17β-HSD and recombinant
expression vectors which include the sequence may be modified so long as they
continue to encode a functionally equivalent enzyme. Moreover, it is contemplated,
within the invention, that codons within the coding region may be altered, inter alia,
in a manner which, given the degeneracy of the genetic code, continues to encode the
same protein or one providing a functionally equivalent protein. It is believed that
nucleotide sequences analogous to SEQ ID No. 1, or those that hybridize under
stringent conditions to the coding region of SEQ ID No. 1, are likely to encode a type
5 17β-HSD functionally equivalent to that encoded by the coding region of SEQ ID
No. 1, especially if such analogous nucleotide sequence is at least 700, preferably at
least 850 and most preferably at least 969 nucleotides in length. As used herein,
except where otherwise specified, "stringent conditions" means 0.1x SSC (0.3 M
sodium chloride and 0.03 M sodium citrate) and 0.1% sodium dodecyl sulphate (SDS)
and 60° C.
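
The effect of codon degeneracy described above can be illustrated with a
minimal sketch. It assumes the standard genetic code and uses the first 13
codons of the coding region of SEQ ID No. 1; the altered sequence and the
function name translate are hypothetical and serve only as an illustration.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces ignored) into one-letter amino acids."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 13 codons of the coding region of SEQ ID No. 1 (nucleotides +11 to 49).
original = "ATG GAT TCC AAA CAG CAG TGT GTA AAG CTA AAT GAT GGC"
# Hypothetical variant in which twelve codons are swapped for synonymous ones.
altered = "ATG GAC TCT AAG CAA CAG TGC GTG AAA CTG AAC GAT GGA"

print(translate(original))
print(translate(altered) == translate(original))
```

Both sequences translate to the same 13 residues, although they differ at
twelve nucleotide positions.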

It is also likely that tissues or cells from human or non-human sources, which
tissues or cells have the enzymatic machinery to convert Δ⁴-dione to testosterone, or
to convert progesterone to 20α-hydroxyprogesterone, include a type 5 17β-HSD
sufficiently analogous to human type 5 17β-HSD to be used in accordance with the
present invention. In particular, cDNA libraries prepared from cells performing the
foregoing conversions may be screened, in accordance with well known techniques,
with probes prepared by reference to the nucleotides disclosed herein, and under
varying degrees of stringency, in order to identify analogous cDNAs in other species.
These analogous cDNAs are preferably at least 70% homologous to SEQ ID No. 1,
more preferably at least 80% homologous, and most preferably at least 90%
homologous. They preferably include stretches of perfect identity at least 10
nucleotides long, more preferably stretches of 15, 20 or even 30 nucleotides of
perfect identity. Appropriate probes may be prepared from SEQ ID No. 1 or
fragments thereof of suitable length, preferably at least 15 nucleotides in length.
Confirmation with at least two distinct probes is preferred. Alternative isolation
strategies, such as polymerase chain reaction (PCR) amplification, may also be used.
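
The homology thresholds and stretches of perfect identity discussed above can
be sketched as follows. This is only an illustration under stated assumptions:
the two sequences are taken to be already aligned, gap-free and of equal
length, and homology is approximated as simple percent identity. The function
names and the candidate fragment are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match in two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def longest_identical_run(a: str, b: str) -> int:
    """Length of the longest stretch of consecutive identical positions."""
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


# Nucleotides +11 to 30 of SEQ ID No. 1 and a hypothetical candidate fragment
# that differs from it at two positions.
reference = "ATGGATTCCAAACAGCAGTG"
candidate = "ATGGATTCGAAACAGCTGTG"

print(percent_identity(reference, candidate))
print(longest_identical_run(reference, candidate))
```

A real comparison would first require an alignment step, which is outside the
scope of this sketch.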

Homologous type 5 17β-HSDs so obtained, as well as the genes encoding
them, are used in accordance with the invention in all of the ways for using SEQ ID
No. 2 and SEQ ID No. 1, respectively.

Recombinant expression vectors can include the entire coding region for
human type 5 17β-HSD as shown in SEQ ID No. 1, the coding region for human type
5 17β-HSD which has been modified, portions of the coding region for human type 5
17β-HSD, the chromosomal DNA of type 5 17β-HSD, an antisense construct to type
5 17β-HSD, or portions of antisense constructs to type 5 17β-HSD.

In the context of the invention, "isolated" means having a higher purity than
exists in nature, but does not require purification from a natural source. Isolated
nucleotides encoding type 5 17β-HSD may be produced synthetically, or by isolating
cDNA thereof from a cDNA library or by any of numerous other methods well
understood in the art.

In one embodiment, the invention provides an isolated nucleotide sequence
encoding type 5 17β-hydroxysteroid dehydrogenase, said sequence being sufficiently
homologous to SEQ ID No. 1 or a complement thereof, to hybridize under stringent
conditions to the coding region of SEQ ID No. 1 or a complement thereof and said
sequence encoding an enzyme which catalyzes the conversion of progesterone to
20α-hydroxyprogesterone and the conversion of 4-androstenedione to testosterone.

In a further embodiment, the invention provides an isolated nucleotide
sequence comprising at least ten consecutive nucleotides identical to 10 consecutive
nucleotides in the coding region of SEQ ID No. 1, or the complement thereof.

In an additional embodiment, the invention provides an oligonucleotide
sequence selected from the group consisting of SEQ ID Nos. 30 through 59.

In another embodiment, the invention provides a recombinant expression
vector comprising a promoter sequence and an oligonucleotide sequence selected from
the group of SEQ ID Nos. 30 to 59.

In a further embodiment, the invention provides a method of blocking synthesis
of type 5 17β-HSD, comprising the step of introducing an oligonucleotide selected
from the group consisting of SEQ ID Nos. 30 to 59 into cells.

In an additional embodiment, the invention provides an isolated chromosomal
DNA fragment which upon transcription and translation encodes type 5
17β-hydroxysteroid dehydrogenase and wherein said fragment contains nine exons and
wherein said fragment includes introns which are 16 kilobase pairs in length.

In another embodiment, the invention provides an isolated DNA sequence
encoding type 5 17β-hydroxysteroid dehydrogenase, said sequence being sufficiently
homologous to SEQ ID No. 3 or a complement thereof, to hybridize under stringent
conditions to SEQ ID No. 3, or its complement.

In a further embodiment, the invention provides a method for producing type 5
17β-hydroxysteroid dehydrogenase, comprising the steps of preparing a recombinant
host transformed or transfected with the vector of claim 3 and culturing said host
under conditions which are conducive to the production of type 5 17β-hydroxysteroid
dehydrogenase by said host.

In an additional embodiment, the invention provides a method for determining
the inhibitory effect of a test compound on the enzymatic activity of type 5
17β-hydroxysteroid dehydrogenase, comprising the steps of providing type 5
17β-hydroxysteroid dehydrogenase; contacting said type 5 17β-hydroxysteroid
dehydrogenase with said test compound; and thereafter determining the enzymatic
activity of said type 5 17β-hydroxysteroid dehydrogenase in the presence of said test
compound.

In an additional embodiment, the invention provides a method of interfering
with the expression of type 5 17β-hydroxysteroid dehydrogenase, comprising the step
of administering nucleic acids substantially identical to at least 15 consecutive
nucleotides of SEQ ID No. 1 or a complement thereof.

In a further embodiment, there is provided a method of interfering with the
synthesis of type 5 17β-hydroxysteroid dehydrogenase, comprising the step of
administering antisense RNA complementary to mRNA encoded by at least 15
consecutive nucleotides of SEQ ID No. 1 or a complement thereof.

In an additional embodiment, the invention provides a method of interfering
with the expression of type 5 17β-hydroxysteroid dehydrogenase, comprising the step
of administering nucleic acids substantially identical to at least 15 consecutive
nucleotides of SEQ ID No. 3 or a complement thereof.

In another embodiment, the invention provides a method of interfering with the
synthesis of type 5 17β-hydroxysteroid dehydrogenase, comprising the step of
administering antisense RNA complementary to mRNA encoded by at least 15
consecutive nucleotides of SEQ ID No. 3 or a complement thereof.

In a further embodiment, there is provided a method for determining the
inhibitory effect of antisense nucleic acids on the enzymatic activity of type 5
17β-hydroxysteroid dehydrogenase, comprising the steps of providing a host system
capable of expressing type 5 17β-hydroxysteroid dehydrogenase; introducing said
antisense nucleic acids into said host system; and thereafter determining the enzymatic
activity of said type 5 17β-hydroxysteroid dehydrogenase.

Other features and advantages of the present invention will become apparent 
from the following description of the invention which refers to the accompanying 
drawings. 


BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1A and 1B are graphs showing the enzymatic activities of Type 5
17β-HSD on various substrates. The enzyme was expressed in embryonal kidney (293)
cells (ATCC CRL 1573) which were transfected with a vector, prepared in accordance
with the invention, and containing the gene encoding human type 5 17β-HSD. Figure
1A shows the substrate specificity of type 5 17β-HSD. The concentration of each
substrate was 0.1 μM. Figure 1B shows the time course of the 20α-HSD and
17β-HSD activities of cells transfected with vectors containing human type 5
17β-HSD. The substrates, progesterone (P) and Δ⁴-dione, were added at a
concentration of 0.1 mM;

Figure 2 is a map of a pCMV vector which is exemplary of one that can be
used to transfect host cells in accordance with the invention;

Figure 3 is the cDNA sequence (SEQ ID No. 1) and the deduced amino acid
sequence (SEQ ID No. 2) of human type 5 17β-HSD. The nucleotide sequence is
numbered in the 5' to 3' direction with the adenosine of the initiation codon (ATG)
designated as +11. The translation stop codon is indicated by asterisks. The
potential post-translational modification sites are underlined, wherein TSK = tyrosine
sulfokinase; CK2 = casein kinase II; PKC = protein kinase C; NG = N-glycosylation;
and NM = N-myristoylation;

Figure 4 is a comparison of the deduced amino acid sequence of human type 5
17β-HSD to the amino acid sequences of rabbit (rb), rat (r), and bovine (b) 20α-HSD
as well as human (h) and rat (r) 3α-HSD, bovine prostaglandin F synthase (b pgfs) and
frog ρ-crystallin (f ρ-crys). The amino acid sequences are indicated using the
conventional single letter code and are numbered on the right. The dashes (-) and
dots (.) indicate identical and missing amino acid residues, respectively;

Figure 5 is a map of the chromosomal DNA of a gene which encodes type 5
17β-HSD; and

Figures 6A and 6B are the nucleotide sequence of the chromosomal DNA of a
gene which encodes type 5 17β-HSD.



WO 97/11162 PCT/CA96/00605



DETAILED DESCRIPTION OF THE INVENTION 

A gene encoding the enzyme, type 5 17β-HSD, has been isolated and encodes
a protein having 323 amino acids with a calculated molecular weight of 36,844
daltons. As shown in Figure 3, the coding portion of this gene includes nucleotides
+11 through 982, including the stop codon (and encodes amino acids +1 through
323), numbered in the 5' to 3' direction.

The chromosomal DNA fragment of the gene for type 5 17β-HSD has also
been characterized. A map of the gene is provided in Figure 5. In particular, it was
found, using primer extension analysis, that the gene spans 16 kilobase pairs (kb)
and contains nine short exons. A portion of the 5' flanking region of the genomic
DNA, as set forth in SEQ ID No. 3, includes 730 base pairs (bp). Exon 1 (SEQ ID
No. 4) contains 37 nucleotides in the 5'-noncoding region and the nucleotides for the
first 28 amino acids. The second intron region includes the nucleotides set forth in
SEQ ID Nos. 5 and 6, which are 252 and 410 bp, respectively. These are joined by a
1.2 kb region whose sequence is not important and has therefore been omitted.
Exon 2 (SEQ ID No. 7) contains nucleotides for the following 56 amino acids of
human type 5 17β-HSD. The following intron region includes SEQ ID Nos. 8 and 9,
700 and 73 bp, respectively, which are joined by a 0.1 kb region for which the
sequence has not been provided. Exon 3 (SEQ ID No. 10) includes the next 117
nucleotides which specify the following 39 amino acids. The fourth intron region is
represented by SEQ ID Nos. 11 and 12, 152 and 208 nucleotides in length,
respectively, with a 0.9 kb region in between which has not been provided. Exon 4
(SEQ ID No. 13) includes the next 78 bp which specify the following 26 amino acids
of the enzyme. Intron region five contains SEQ ID Nos. 14 and 15, with 98 and 249
nucleotides, respectively, with a 0.1 kb region in the middle which has not been
provided. The fifth exon (SEQ ID No. 16) contains nucleotides for the following 41
amino acids of human type 5 17β-HSD. The sixth intron region, set forth in SEQ ID
Nos. 17 and 18 with 138 and 189 bp, respectively, also includes a 2.8 kb region
which has not been provided. Exon 6 (SEQ ID No. 19) contains nucleotides for the
following 36 amino acids of type 5 17β-HSD, as well as two nucleotides of codon
227 (Trp). The next intron region includes a 136 bp portion (SEQ ID No. 20) and a
66 bp portion (SEQ ID No. 21) which are joined by a 0.1 kb region which is not set
forth. Exon 7 (SEQ ID No. 22) contains the third nucleotide of codon 227 (Trp) and
nucleotides for the following 55 codons. The following intron region includes a 136
nucleotide region (SEQ ID No. 23), a 2.5 kb region which is not provided and a
286 bp region (SEQ ID No. 24). Exon 8 (SEQ ID No. 25) includes 83 nucleotides
which code for the following 27 amino acids and 2 nucleotides of codon 310. The
ninth intron region contains 713 nucleotides (SEQ ID No. 26) followed by a 1 kb
region which has not been provided followed by a 415 nucleotide region (SEQ ID
No. 27). Exon 9 (SEQ ID No. 28) contains the third nucleotide of codon 310, 42
nucleotides for the last 13 amino acids and a stop codon, and approximately 200
nucleotides in the 3'-untranslated region. A polymorphic (GT)n repeat region that can
be used to perform genetic linkage mapping of the type 5 17β-HSD can be found 255
nucleotides downstream from the TAA stop codon. SEQ ID No. 29 sets forth 109 bp
of additional genomic sequence. The nucleotide sequence of the gene fragment, as
described above, is provided in Figures 6A and 6B.
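
As a consistency check on the exon structure described above, the per-exon
codon counts can be tallied against the stated protein length of 323 amino
acids and the stated coding span of nucleotides +11 through 982. The sketch
below uses only figures given in the text; the variable and function names
are hypothetical.

```python
# Complete codons contributed by each exon, as stated in the description.
EXON_CODONS = {1: 28, 2: 56, 3: 39, 4: 26, 5: 41, 6: 36, 7: 55, 8: 27, 9: 13}

# Codons split across an exon boundary: codon 227 (exons 6 and 7) and
# codon 310 (exons 8 and 9).
SPLIT_CODONS = 2


def total_residues() -> int:
    """Number of amino acids encoded by the nine exons taken together."""
    return sum(EXON_CODONS.values()) + SPLIT_CODONS


def coding_length(first: int, last: int) -> int:
    """Length in nucleotides of an inclusive coding span."""
    return last - first + 1


residues = total_residues()
span = coding_length(11, 982)

print(residues)         # amino acids encoded
print(span)             # nucleotides, including the stop codon
print(span // 3 - 1)    # codons excluding the stop codon
```

The exon counts and the coding span both correspond to 323 amino acids. The
figure of 969 nucleotides given elsewhere for the most preferred length equals
323 codons without the stop codon.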

The type 5 17β-HSD enzyme can be produced by incorporating the nucleotide
sequence for the coding portion of the gene into a vector which is then transformed or
transfected into a host system which is capable of expressing the enzyme. The DNA
can be maintained transiently in the host or can be stably integrated into the genome of
the host cell. In addition, the chromosomal DNA can be incorporated into a vector
and transfected into a host system for cloning.

In particular, for the cloning and expression of type 5 17β-HSD, any common
expression vectors, such as plasmids, can be used. These vectors can be prokaryotic
expression vectors including those derived from bacteriophage λ such as λgt11 and
λEMBL3, E. coli strains such as pBR322 and Bluescript (Stratagene); or eukaryotic
vectors, such as those in the pCMV family. Vectors incorporating an isolated human
cDNA shown in Sequence ID No. 1 (ATCC Deposit No. ) and the chromosomal
DNA as shown in Sequence ID Nos. 3 through 29 (ATCC Deposit No. ) for type 5
17β-HSD have been placed on deposit at the American Type Culture Collection
(ATCC, Rockville, MD), in accordance with the terms of the Budapest Treaty, and
will be made available to the public upon issuance of a patent based on the present
patent application.

These vectors generally include appropriate replication and control sequences
which are compatible with the host system into which the vectors are transfected. A
promoter sequence is generally included. For prokaryotes, some representative
promoters include β-lactamase, lactose, and tryptophan. In mammalian cells,
commonly used promoters include, but are not limited to, adenovirus,
cytomegalovirus (CMV) and simian virus 40 (SV40). The vector can also optionally
include, as appropriate, an origin of replication, ribosome binding sites, RNA splice
sites, polyadenylation sites, transcriptional termination sequences and/or a selectable
marker. It is well understood that there are a variety of vector systems with various
characteristics which can be used in the practice of the invention. A map of the
pCMV vector, which is an example of a vector which can be used in the practice of
the invention, is provided in Figure 2.

Host systems which are commonly known for expressing an enzyme, and which
may be transfected with an appropriate vector which includes a gene for Type 5
17β-HSD, can be used in the practice of the invention. These host systems
include prokaryotic hosts, such as E. coli, bacilli, such as Bacillus subtilis, and other
enterobacteria, such as Salmonella, Serratia, and Pseudomonas species. Eukaryotic
microbes, including yeast cultures, can also be used. The most common of these is
Saccharomyces cerevisiae, although other species are commercially available and can
be used. Furthermore, cell cultures can be grown which are derived from mammalian
cells. Some examples of suitable host cell lines include embryonal kidney (293),
SW-13, Chinese hamster ovary (CHO), HeLa, myeloma, Jurkat, COS-1, BHK, WI38
and Madin-Darby canine kidney (MDCK). In the practice of the invention, the 293
cells are preferred.

Type 5 17β-HSD, whether recombinantly produced as described herein,
purified from nature, or otherwise produced, can be used in assays to identify
compounds which inhibit or alter the activity of the enzyme. In particular, since type
5 17β-HSD is shown to catalyze the conversion of progesterone to 20α-OH-P and the
conversion of Δ⁴-dione to testosterone, this enzyme can be used to identify compounds
which interfere with the production of these sex steroids. It is preferred that the
enzyme be obtained directly from the recombinant host, wherein following expression,
a crude homogenate is prepared which includes the enzyme. A substrate of the
enzyme, such as progesterone or Δ⁴-dione, and a compound to be tested are then mixed
with the homogenate. The activity of the enzyme with and without the test compound
is compared. Numerous methods are known which can be used to indicate the effects
of the test compound on the activity of the enzyme by easy detection of the relative
amounts of substrate and product over time. For example, it is possible to label the
substrate so that the label also stays on any product that is formed. Radioactive labels,
such as ¹⁴C or ³H, which can be quantitatively analyzed are particularly useful.

It is preferred that the mixture of the enzyme, test compound and substrate be
allowed to incubate for a predetermined amount of time. In addition, it is preferred
that the product is separated from the substrate for easier analysis. A number of
separation techniques are known, for example, thin layer chromatography (TLC), high
pressure liquid chromatography (HPLC), spectrophotometry, gas chromatography,
mass spectrometry and nuclear magnetic resonance (NMR). However, any
known method which can differentiate between a substrate and a product can be used.
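
The comparison of enzyme activity with and without a test compound can be
sketched as a simple calculation. This is a minimal illustration, assuming
that a radiolabelled substrate and its product have been separated and counted;
the function names and all count values below are hypothetical and are not data
from the present application.

```python
def fraction_converted(substrate_counts: float, product_counts: float) -> float:
    """Fraction of recovered label that is found in the product."""
    total = substrate_counts + product_counts
    if total <= 0:
        raise ValueError("no label recovered")
    return product_counts / total


def percent_inhibition(control_conversion: float, treated_conversion: float) -> float:
    """Reduction in conversion caused by the test compound, relative to control."""
    if control_conversion <= 0:
        raise ValueError("control shows no conversion")
    return 100.0 * (1.0 - treated_conversion / control_conversion)


# Hypothetical counts after the same incubation time.
control = fraction_converted(substrate_counts=6000, product_counts=4000)
treated = fraction_converted(substrate_counts=9000, product_counts=1000)

print(round(control, 3))
print(round(treated, 3))
print(round(percent_inhibition(control, treated), 1))
```

Expressing conversion as a fraction of the total recovered label makes the
result independent of small differences in recovery between samples.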

It is also contemplated that the gene for type 5 17β-HSD or a portion thereof
can be used to produce antisense nucleic acid sequences for inhibiting expression of
Type 5 17β-HSD in vivo. Thus activity of the enzyme and levels of its products (e.g.
testosterone) may be reduced where desirable. In general, antisense nucleic acid
sequences can interfere with transcription, splicing or translation processes. Antisense
sequences can prevent transcription by forming a triple helix or hybridizing to an
opened loop which is created by RNA polymerase or hybridizing to nascent RNA.
On the other hand, splicing can advantageously be interfered with if the antisense
sequences bind at the intersection of an exon and an intron. Finally, translation can be
affected by blocking the binding of initiation factors or by preventing the assembly of
ribosomal subunits at the start codon or by blocking the ribosome from the coding
portion of the mRNA, preferably by using RNA that is antisense to the message. For
further general information, see Helene et al., Biochimica et Biophysica Acta,
1049:99-125 (1990), which is herein incorporated by reference in its entirety.

An antisense nucleic acid sequence is an RNA or single stranded DNA
sequence which is complementary to the target portion of the target gene. These
antisense sequences are introduced into cells where the complementary strand base
pairs with the target portion of the target gene, thereby blocking the transcription,
splicing or translation of the gene and eliminating or reducing the production of type 5
17β-HSD. The length of the antisense nucleic acid sequence need be no more than is
sufficient to interfere with the transcription, splicing or translation of functional type 5
17β-HSD. Antisense strands can range in size from 10 nucleotides to the complete
gene; however, about 10 to 50 nucleotides are preferred, and 15 to 25 nucleotides are
most preferred.

Although it is contemplated that any portion of the gene could be used to
produce antisense sequences, it is preferred that the antisense is directed to the coding
portion of the gene or to the sequence around the translation initiation site of the
mRNA or to a portion of the promoter. Some examples of specific antisense
oligonucleotide sequences in the coding region which can be used to block type 5
17β-HSD synthesis include:
TTTAGCTTTACACACTGCTGTT (SEQ ID No. 30);
TCCAAAGCTTTACTTCTCGG (SEQ ID No. 31);
GATGAAAAGTGGACCA (SEQ ID No. 32);
ATCTGTTGGTGAAAGTTC (SEQ ID No. 33);
TCCAGCTGCCTGCGGT (SEQ ID No. 34);
CTTGTACTTGAGTCCTG (SEQ ID No. 35);
CTCCGGTTGAAATACGGA (SEQ ID No. 36);
CATCGTTTGTCTCGTTGAGA (SEQ ID No. 37);
TCACTGTTAAAATAGTGGAGAT (SEQ ID No. 38);
ATCTGAATATGGATAAT (SEQ ID No. 39).
Examples of antisense oligonucleotide sequences which can block the splicing of the
type 5 17β-HSD premessage are as follows:
TTCTCGGAACCTGGAGGAGC (SEQ ID No. 40);
GACACAGTACCTTTGAAGTG (SEQ ID No. 41);
TGGACCAAAGCTGCAGAGGT (SEQ ID No. 42);
CCTCACCTGGCTGAAATAGA (SEQ ID No. 43);
AAGCACTCACCTCCCAGGTG (SEQ ID No. 44);
GACATTCTACCTGCAGTTGA (SEQ ID No. 45);
CTCAAAAACCTATCAGAAA (SEQ ID No. 46);
GGAAACTTACCTATCACTGT (SEQ ID No. 47);
GCTAGCAAAACTGAAAAGAG (SEQ ID No. 48).
Examples of antisense oligonucleotide sequences which inhibit the promoter activity
of type 5 17β-HSD include:
GAGAAATATTCATTCTG (SEQ ID No. 49);
CGAGTCCTGATAAAGCTG (SEQ ID No. 50);
GATGAGGGTGCAAATAA (SEQ ID No. 51);
GGAGTGTTAATTAATAACAGTTT (SEQ ID No. 52);
CAGAGATTACAAAAACAAT (SEQ ID No. 53);
TGCCT IT 1 "T AC ATTTTC AATC A (SEQ ID No. 54);
ACACATAATTTAAAGGA (SEQ ID No. 55);
TTAAATTATTCAAAAGG (SEQ ID No. 56);
AAGAGAAATATTCATTTCTG (SEQ ID No. 57);
CCCCTCCCCCCACCCCTGCA (SEQ ID No. 58);
CTGCCGTGATAATGCCCC (SEQ ID No. 59).
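
The relationship between an antisense oligonucleotide and its target can be
checked by taking the reverse complement of the oligonucleotide and locating it
in the sense strand. The sketch below does this for SEQ ID No. 30 against the
first 49 nucleotides of SEQ ID No. 1 as given in the sequence listing above;
the function and variable names are hypothetical.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Nucleotides 1 to 49 of SEQ ID No. 1 (sense strand), spaces removed.
cdna_start = (
    "GTGACAGGGA ATG GAT TCC AAA CAG CAG TGT GTA AAG CTA AAT GAT GGC"
).replace(" ", "")

seq_id_30 = "TTTAGCTTTACACACTGCTGTT"

target = reverse_complement(seq_id_30)
offset = cdna_start.find(target)  # 0-based index, or -1 if absent

print(target)
print(offset + 1)  # 1-based position in the cDNA numbering
```

The reverse complement of SEQ ID No. 30 is found starting at nucleotide 21 of
the cDNA, that is, within the coding region shortly downstream of the
initiation codon at +11.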

As is well understood in the art, the oligonucleotide sequences can be modified
in various manners in order to increase the effectiveness of the treatment with
oligonucleotides. In particular, the sequences can be modified to include additional
RNA at the 3' end of the RNA which can form a hairpin-loop structure and thereby
prevent degradation by nucleases. In addition, the chemical linkages in the backbone
of the oligonucleotides can be modified to also prevent cleavage by nucleases.

There are numerous methods which are known in the art for introducing the
antisense strands into cells. One strategy is to incorporate the gene which encodes
type 5 17β-HSD in the opposite orientation in a vector so that the RNA which is
transcribed from the plasmid is complementary to the mRNA transcribed from the
cellular gene. A strong promoter, such as pCMV, is generally included in the vector,
upstream of the gene sequence, so that a large amount of the antisense RNA is
produced and is available for binding sense mRNA. The vectors are then transfected
into cells which are then administered. It is also possible to produce single stranded
DNA oligonucleotides or antisense RNA and incorporate these into cells or liposomes
which are then administered. The use of liposomes, such as those described in
WO95/03788, which is herein incorporated by reference, is preferred. However,
other methods which are well understood in the art can also be used to introduce the
antisense strands into cells and to administer them to patients in need of such
treatment.

The following is an example of the expression of human type 5 17β-HSD.
This example is intended to be illustrative of the invention and it is well understood by
those of skill in the art that modifications, alterations and different techniques can be
used within the scope of the invention.

Expression of 20α, 17β-HSD (Type 5 17β-HSD)

Construction of the expression vector and nucleotide sequence determination 

The phage DNA was digested with EcoRI restriction enzyme and the resulting
cDNA fragments were inserted in the EcoRI site downstream of the cytomegalovirus
(CMV) promoter of the pCMV vector as shown in Figure 2. The recombinant pCMV
plasmids were amplified in Escherichia coli DH5α competent cells, and were isolated
using the alkaline lysis procedure as described by Maniatis in Molecular Cloning: A
Laboratory Manual (Cold Spring Harbor Press 1982). The sequencing of
double-stranded plasmid DNA was performed according to the dideoxy chain termination
method described by Sanger, F. et al., Proc. Natl. Acad. Sci., 74:5463-5467 (1977)
using a T7 DNA polymerase sequencing kit (Pharmacia LKB Biotechnology). In
order to avoid errors, all sequences were determined by sequencing both strands of the
DNA. The oligonucleotide primers were synthesized using a 394 DNA/RNA
synthesizer (Applied Biosystems).

As shown in Figure 2, the pCMV vector contains 582 nucleotides of the
pCMV promoter, followed by 74 nucleotides of unknown origin which include the
EcoRI and HindIII sites, followed by 432 base pairs (bp) of a small t intron (fragment
4713 - 4570) and a polyadenylation signal (fragment 2825 - 2536) of SV40, followed
by 156 nucleotides of unknown origin, followed by 1989 bp of the PvuII (628) to
AatII (2617) fragment from the pUC19 vector (New England Biolabs) which contains
an E. coli origin of replication and an ampicillin resistance gene for propagation in E.
coli.
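The segment lengths given for the pCMV vector can be tallied to estimate the size of the backbone. The following Python sketch is only a bookkeeping aid: the segment labels, the function name `vector_length` and the use of the full 1206-nucleotide cDNA of SEQ ID NO: 1 as the insert are assumptions made for illustration, since the exact length of the cloned EcoRI fragment is not stated.

```python
# Segment lengths (in nucleotides) as stated in the vector description.
# The short labels are informal names chosen for this sketch.
PCMV_SEGMENTS = [
    ("CMV promoter", 582),
    ("linker of unknown origin (EcoRI, HindIII sites)", 74),
    ("SV40 small t intron and polyadenylation signal", 432),
    ("spacer of unknown origin", 156),
    ("pUC19 PvuII-AatII fragment (ori, ampicillin resistance)", 1989),
]


def vector_length(segments, insert_length=0):
    """Total plasmid length: backbone segments plus an optional cDNA insert."""
    return sum(length for _, length in segments) + insert_length


backbone = vector_length(PCMV_SEGMENTS)
# Hypothetical recombinant carrying a 1206-nt insert (length of SEQ ID NO: 1).
recombinant = vector_length(PCMV_SEGMENTS, insert_length=1206)
print(backbone, recombinant)
```

Under these assumptions the empty backbone comes to 3233 nucleotides, and the pUC19 portion agrees with the stated coordinates (2617 - 628 = 1989).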

Transient expression in transformed embryonal kidney (293) cells

The vectors were transfected using the calcium phosphate procedure described
by Kingston, R.E., In: Current Protocols in Molecular Biology, Ausubel et al. eds.,
pp. 9.1.1 - 9.1.9, John Wiley & Sons, N.Y. (1987), using 1 to 10 µg of
recombinant plasmid DNA per 10⁶ cells. The total amount of DNA was kept at 10 µg of
plasmid DNA per 10⁶ cells by completing with pCMV plasmid without insert. The
cells were initially plated at 10⁴ cells/cm² in Falcon® culture flasks and grown in
Dulbecco's modified Eagle's medium containing 10% (vol/vol) fetal bovine serum
(Hyclone, Logan, UT) and supplemented with 2 mM L-glutamine, 1 mM sodium
pyruvate, 100 IU penicillin/ml, and 100 µg streptomycin sulfate/ml, under a
humidified atmosphere of air/CO2 (95%/5%) at 37°C.
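The DNA amounts in this transfection scheme follow a simple rule: whatever part of the 10 µg per 10⁶ cells is not recombinant plasmid is made up with empty pCMV. A minimal Python sketch of that arithmetic is given below; the function names and the 75 cm² flask used in the example are hypothetical and are not taken from the description.

```python
TOTAL_DNA_UG_PER_MILLION_CELLS = 10.0  # total plasmid DNA, as stated in the text
SEEDING_DENSITY_PER_CM2 = 1e4          # initial plating density, as stated


def transfection_mix(recombinant_ug_per_million, n_cells):
    """Return (recombinant µg, empty-vector µg) for a given number of cells."""
    if not 1.0 <= recombinant_ug_per_million <= TOTAL_DNA_UG_PER_MILLION_CELLS:
        raise ValueError("recombinant plasmid must be 1 to 10 µg per 10^6 cells")
    scale = n_cells / 1e6
    recombinant = recombinant_ug_per_million * scale
    empty_vector = (TOTAL_DNA_UG_PER_MILLION_CELLS - recombinant_ug_per_million) * scale
    return recombinant, empty_vector


def cells_to_seed(flask_area_cm2):
    """Number of cells to plate for a flask of the given growth area."""
    return int(SEEDING_DENSITY_PER_CM2 * flask_area_cm2)


# Example: 2.5 µg recombinant plasmid per 10^6 cells, 2 x 10^6 cells in total.
print(transfection_mix(2.5, 2e6))
# Example: a hypothetical 75 cm^2 flask.
print(cells_to_seed(75))
```

Keeping the total DNA constant in this way means that wells transfected with different amounts of recombinant plasmid differ only in the dose of the expression construct.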

Assay of enzymatic activity

The determination of enzymatic activity was performed as described by Luu-The
et al., Biochemistry, 13:8861-8865 (1991) which is herein incorporated by
reference. See also Lachance et al., J. Biol. Chem., 265:20469 - 20475 (1990).
Briefly, 0.1 µM of the indicated ¹⁴C-labeled substrate (Dupont Inc. (Canada)), namely,
dehydroepiandrosterone (DHEA), 4-androstene-3,17-dione (Δ4-dione), testosterone
(T), estrone (E1), estradiol (E2), dihydrotestosterone (DHT), and progesterone
(PROG), was added to freshly changed culture medium in a 6-well culture plate.
After incubation for 1 hour, the steroids were extracted twice with 2 ml of ether. The
organic phases were pooled and evaporated to dryness. The steroids were solubilized in
50 µl of dichloromethane, applied to a Silica gel 60 thin layer chromatography (TLC)
plate (Merck, Darmstadt, Germany) and then separated by migration in the
toluene-acetone (4:1) solvent system (Luu-The, V. et al., J. Invest. Dermatol., 102:221-226
(1994) which is herein incorporated by reference). The substrates and metabolites
were identified by comparison with reference steroids, revealed by autoradiography
and quantitated using the Phosphoimager System (Molecular Dynamics, Sunnyvale,
CA).
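Once the substrate and metabolite spots have been quantitated, enzymatic activity is commonly summarized as the percentage of labeled substrate converted to product. The description does not give the formula actually used, so the Python sketch below is only one plausible convention: the function name `percent_conversion`, the two-spot model and the example signal values are assumptions made for illustration.

```python
def percent_conversion(substrate_signal, product_signal):
    """Percentage of total radioactivity recovered in the product spot.

    Both arguments are background-corrected signal intensities (arbitrary
    units) for the remaining substrate and for the metabolite formed.
    """
    total = substrate_signal + product_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return 100.0 * product_signal / total


# Hypothetical example: 7500 units left as substrate, 2500 units as product.
print(percent_conversion(7500, 2500))
```

Expressing the product as a fraction of the total recovered radioactivity makes the result insensitive to differences in extraction yield between wells.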

Cloning of the type 5 17β-HSD genomic DNA clone

The hybridization and sequencing methods were as described above and as
previously described (Luu-The et al., Mol. Endocrinol., 4:268-275 (1990); Luu-The
et al., DNA and Cell Biol., 14:511-518 (1995); Lachance et al., J. Biol. Chem.,
265:20469-20475 (1990); Lachance et al., DNA and Cell Biol., 10:701-711 (1991);
Bernier et al., J. Biol. Chem., 269:28200-28205 (1994), which are herein
incorporated by reference).

About 20 recombinant clones which gave the strongest hybridization signal
were selected for second and third screening in order to isolate a single phage plaque.
The two longest clones that hybridized with specific oligonucleotide probes located
at the 5' and 3' regions of type 5 17β-HSD, respectively, were selected for mapping,
subcloning and sequencing. As shown in Figures 5 and 6, the gene is spread over
approximately 16 kilobase pairs, most of it intronic, and contains 9 short exons. A primer
extension analysis using oligoprimer CAT-CAT-TTA-GCT-TTA-CAT-ACT-GCT-G
located at positions 13 to 27, indicates that the start site is situated 37 nucleotides
upstream from the ATG initiating codon.

The sites and signatures in the primary protein sequence were detected using
PC/Gene software (IntelliGenetics Inc., Mountain View, CA). This analysis revealed
a potential N-glycosylation site at Asn-198; five protein kinase C sites at Ser-73,
Thr-82, Ser-102, Ser-121, and Ser-221; five casein kinase II phosphorylation sites at
Ser-129, Thr-146, Ser-221, Ser-271, and Thr-289; two N-myristoylation sites at Gly-158
and Gly-298; a tyrosine sulfation site at Tyr-55; an aldo/keto reductase family
signature 1 (25) at amino acids 158 to 168 and an aldo/keto reductase family putative
active site signature at amino acids 262 to 280.
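The nine-exon organization of the gene can be cross-checked against the cDNA: the coding portions of the exons should add up to the length of the coding region of SEQ ID NO: 1 (CDS at 11..982). The Python sketch below performs this tally. The exon lengths are read from the LOCATION fields of SEQ ID NOS: 4, 7, 10, 13, 16, 19, 22 and 25; the figure of 43 nucleotides for the coding part of the last exon (SEQ ID NO: 28, through the TAA stop codon) was counted by hand for this example and should be treated as an assumption.

```python
# Coding nucleotides contributed by each exon (exon number -> length).
# Exon 1 is counted from the ATG (feature 38..121 of SEQ ID NO: 4); exon 9
# is counted only up to and including the stop codon (hand-counted value).
CODING_EXON_LENGTHS = {
    1: 84, 2: 168, 3: 117, 4: 78, 5: 123,
    6: 110, 7: 166, 8: 83, 9: 43,
}

CDS_START, CDS_END = 11, 982  # CDS location given for SEQ ID NO: 1

cds_length = CDS_END - CDS_START + 1
exon_total = sum(CODING_EXON_LENGTHS.values())

codons = cds_length // 3       # includes the stop codon
residues = codons - 1          # amino acids actually encoded

print(exon_total, cds_length, codons, residues)
```

With these values the exons account for all 972 coding nucleotides, that is 324 codons including the stop codon, which is consistent with the 323 residues printed for SEQ ID NO: 2.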

As described above, the enzymatic activity of the type 5 17β-HSD was
evaluated by transfecting 293 cells with vectors which included the gene encoding
human type 5 17β-HSD. The ability of the enzyme to catalyze the transformation of
progesterone (P) to 20α-hydroxyprogesterone (20α-OH-P), 4-androstenedione
(Δ4-dione) to testosterone (T), 5α-androstane-3,17-dione (A-dione) to dihydrotestosterone
(DHT), dehydroepiandrosterone (DHEA) to 5-androstene-3β,17β-diol, and estrone
(E1) to estradiol (E2) was analyzed. As shown in Figure 1A, the enzyme possesses
high reductive 20α-HSD activity, wherein progesterone (P) is transformed to the
inactive 20α-OH-P, and 17β-HSD activity, wherein Δ4-dione is converted to
testosterone (T). However, 3α-HSD activity which is responsible for the
transformation of DHT to 5α-androstane-3α,17β-diol is negligible. The ability of this
enzyme to transform E1 and E2 was also negligible (Figure 1A). Figure 1B shows
that the 20α-HSD and 17β-HSD activities increased over time.

The isolated amino acid sequence of human type 5 17β-HSD was also
compared with rabbit 20α-HSD (rb), rat 20α-HSD (r), human 3α-HSD (h), rat
3α-HSD (r), bovine prostaglandin F synthase (b pgfs), frog ρ-crystallin (f ρ-crys) and
human type 1 and type 2 17β-HSDs (h) as shown in Figure 4. These sequences show
76.2%, 70.7%, 84.0%, 68.7%, 78.3%, 59.7%, 15.2% and 15.0% identity with type
5 17β-HSD, respectively.
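Percent identity figures of this kind are computed from pairwise alignments. The description does not say which convention was used, so the following Python sketch is only one common choice: identical positions divided by the number of aligned columns in which neither sequence has a gap. The function name `percent_identity` is hypothetical, and the second sequence in the example is invented purely to exercise the code; only the first ten residues of type 5 17β-HSD (MDSKQQCVKL) come from the sequence listing.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gap columns are excluded from the denominator
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# First ten residues of type 5 17β-HSD against an invented aligned sequence.
type5_start = "MDSKQQCVKL"
made_up_homolog = "MDSKHQ-VKL"
print(round(percent_identity(type5_start, made_up_homolog), 1))
```

Other conventions, for example dividing by the length of the shorter sequence or by the full alignment length, would give slightly different percentages for the same alignment.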

Although the present invention has been described in relation to particular
embodiments thereof, many other variations and modifications and other uses will be
apparent to those skilled in the art.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: LUU-THE, Van
               LABRIE, Fernand
(iii) NUMBER OF SEQUENCES: 59
(iv) CORRESPONDENCE ADDRESS:
    (D) STATE: NY
    (E) COUNTRY: US
    (F) ZIP: 10036-8403
(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:
    (C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Meilman, Edward
    (B) REGISTRATION NUMBER: 24,735
    (C) REFERENCE/DOCKET NUMBER: P/1259-313
(ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: (212) 382-0700
    (B) TELEFAX: (212) 382-0888
    (C) TELEX: 236925

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1206 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 11..982
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GTGACAGGGA ATG GAT TCC AAA CAG CAG TGT GTA AAG CTA AAT GAT GGC    49
           Met Asp Ser Lys Gln Gln Cys Val Lys Leu Asn Asp Gly
           1               5                   10

CAC TTC ATG CCT GTA TTG GGA TTT GGC ACC TAT GCA CCT CCA GAG GTT    97
His Phe Met Pro Val Leu Gly Phe Gly Thr Tyr Ala Pro Pro Glu Val
    15                  20                  25

CCG AGA AGT AAA GCT TTG GAG GTC ACC AAA TTA GCA ATA GAA GCT GGG   145
Pro Arg Ser Lys Ala Leu Glu Val Thr Lys Leu Ala Ile Glu Ala Gly
30                  35                  40                  45

TTC CGC CAT ATA GAT TCT GCT CAT TTA TAC AAT AAT GAG GAG CAG GTT   193
Phe Arg His Ile Asp Ser Ala His Leu Tyr Asn Asn Glu Glu Gln Val
                50                  55                  60

GGA CTG GCC ATC CGA AGC AAG ATT GCA GAT GGC AGT GTG AAG AGA GAA   241
Gly Leu Ala Ile Arg Ser Lys Ile Ala Asp Gly Ser Val Lys Arg Glu
            65                  70                  75

GAC ATA TTC TAC ACT TCA AAG CTT TGG TCC ACT TTT CAT CGA CCA GAG   289
Asp Ile Phe Tyr Thr Ser Lys Leu Trp Ser Thr Phe His Arg Pro Glu
        80                  85                  90

TTG GTC CGA CCA GCC TTG GAA AAC TCA CTG AAA AAA GCT CAA TTG GAC   337
Leu Val Arg Pro Ala Leu Glu Asn Ser Leu Lys Lys Ala Gln Leu Asp
    95                  100                 105

TAT GTT GAC CTC TAT CTT ATT CAT TCT CCA ATG TCT CTA AAG CCA GGT   385
Tyr Val Asp Leu Tyr Leu Ile His Ser Pro Met Ser Leu Lys Pro Gly
110                 115                 120                 125

GAG GAA CTT TCA CCA ACA GAT GAA AAT GGA AAA GTA ATA TTT GAC ATA   433
Glu Glu Leu Ser Pro Thr Asp Glu Asn Gly Lys Val Ile Phe Asp Ile
                130                 135                 140

GTG GAT CTC TGT ACC ACC TGG GAG GCC ATG GAG AAG TGT AAG GAT GCA   481
Val Asp Leu Cys Thr Thr Trp Glu Ala Met Glu Lys Cys Lys Asp Ala
            145                 150                 155

GGA TTG GCC AAG TCC ATT GGG GTG TCA AAC TTC AAC CGC AGG CAG CTG   529
Gly Leu Ala Lys Ser Ile Gly Val Ser Asn Phe Asn Arg Arg Gln Leu
        160                 165                 170

GAG ATG ATC CTC AAC AAG CCA GGA CTC AAG TAC AAG CCT GTC TGC AAC   577
Glu Met Ile Leu Asn Lys Pro Gly Leu Lys Tyr Lys Pro Val Cys Asn
    175                 180                 185

CAG GTA GAA TGT CAT CCG TAT TTC AAC CGG AGT AAA TTG CTA GAT TTC   625
Gln Val Glu Cys His Pro Tyr Phe Asn Arg Ser Lys Leu Leu Asp Phe
190                 195                 200                 205

TGC AAG TCG AAA GAT ATT GTT CTG GTT GCC TAT AGT GCT CTG GGA TCT   673
Cys Lys Ser Lys Asp Ile Val Leu Val Ala Tyr Ser Ala Leu Gly Ser
                210                 215                 220

CAA CGA GAC AAA CGA TGG GTG GAC CCG AAC TCC CCG GTG CTC TTG GAG   721
Gln Arg Asp Lys Arg Trp Val Asp Pro Asn Ser Pro Val Leu Leu Glu
            225                 230                 235

GAC CCA GTC CTT TGT GCC TTG GCA AAA AAG CAC AAG CGA ACC CCA GCC   769
Asp Pro Val Leu Cys Ala Leu Ala Lys Lys His Lys Arg Thr Pro Ala
        240                 245                 250

CTG ATT GCC CTG CGC TAC CAG CTG CAG CGT GGG GTT GTG GTC CTG GCC   817
Leu Ile Ala Leu Arg Tyr Gln Leu Gln Arg Gly Val Val Val Leu Ala
    255                 260                 265

AAG AGC TAC AAT GAG CAG CGC ATC AGA CAG AAC GTG CAG GTT TTT GAG   865
Lys Ser Tyr Asn Glu Gln Arg Ile Arg Gln Asn Val Gln Val Phe Glu
270                 275                 280                 285

TTC CAG TTG ACT GCA GAG GAC ATG AAA GCC ATA GAT GGC CTA GAC AGA   913
Phe Gln Leu Thr Ala Glu Asp Met Lys Ala Ile Asp Gly Leu Asp Arg
                290                 295                 300

AAT CTC CAC TAT TTT AAC AGT GAT AGT TTT GCT AGC CAC CCT AAT TAT   961
Asn Leu His Tyr Phe Asn Ser Asp Ser Phe Ala Ser His Pro Asn Tyr
            305                 310                 315

CCA TAT TCA GAT GAA TAT TAA CATGGAGACT TTGCCTGATG ATGTCTACCA     1012
Pro Tyr Ser Asp Glu Tyr
        320

GAAGGCCCTG TGTGTGGATG GTGACGCAGA GGACGTCTCT ATGCCGGTGA CTGGACATAT 1072
CACCTCTACT TAAATCCGTC CTGTTTAGCG ACTTCAGTCA ACTACAGCTC ACTCCATAGG 1132
CCAGAAATAC AATAAATCCT GTTTAGCGAC TTCAGTCAAC TACAGCTCAC TCCATAGGCC 1192
AGAAATACAA TAAA                                                   1206

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 324 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Asp Ser Lys Gln Gln Cys Val Lys Leu Asn Asp Gly His Phe Met
1               5                   10                  15

Pro Val Leu Gly Phe Gly Thr Tyr Ala Pro Pro Glu Val Pro Arg Ser
            20                  25                  30

Lys Ala Leu Glu Val Thr Lys Leu Ala Ile Glu Ala Gly Phe Arg His
        35                  40                  45

Ile Asp Ser Ala His Leu Tyr Asn Asn Glu Glu Gln Val Gly Leu Ala
    50                  55                  60

Ile Arg Ser Lys Ile Ala Asp Gly Ser Val Lys Arg Glu Asp Ile Phe
65                  70                  75                  80

Tyr Thr Ser Lys Leu Trp Ser Thr Phe His Arg Pro Glu Leu Val Arg
                85                  90                  95

Pro Ala Leu Glu Asn Ser Leu Lys Lys Ala Gln Leu Asp Tyr Val Asp
            100                 105                 110

Leu Tyr Leu Ile His Ser Pro Met Ser Leu Lys Pro Gly Glu Glu Leu
        115                 120                 125

Ser Pro Thr Asp Glu Asn Gly Lys Val Ile Phe Asp Ile Val Asp Leu
    130                 135                 140

Cys Thr Thr Trp Glu Ala Met Glu Lys Cys Lys Asp Ala Gly Leu Ala
145                 150                 155                 160

Lys Ser Ile Gly Val Ser Asn Phe Asn Arg Arg Gln Leu Glu Met Ile
                165                 170                 175

Leu Asn Lys Pro Gly Leu Lys Tyr Lys Pro Val Cys Asn Gln Val Glu
            180                 185                 190

Cys His Pro Tyr Phe Asn Arg Ser Lys Leu Leu Asp Phe Cys Lys Ser
        195                 200                 205

Lys Asp Ile Val Leu Val Ala Tyr Ser Ala Leu Gly Ser Gln Arg Asp
    210                 215                 220

Lys Arg Trp Val Asp Pro Asn Ser Pro Val Leu Leu Glu Asp Pro Val
225                 230                 235                 240

Leu Cys Ala Leu Ala Lys Lys His Lys Arg Thr Pro Ala Leu Ile Ala
                245                 250                 255

Leu Arg Tyr Gln Leu Gln Arg Gly Val Val Val Leu Ala Lys Ser Tyr
            260                 265                 270

Asn Glu Gln Arg Ile Arg Gln Asn Val Gln Val Phe Glu Phe Gln Leu
        275                 280                 285

Thr Ala Glu Asp Met Lys Ala Ile Asp Gly Leu Asp Arg Asn Leu His
    290                 295                 300

Tyr Phe Asn Ser Asp Ser Phe Ala Ser His Pro Asn Tyr Pro Tyr Ser
305                 310                 315                 320

Asp Glu Tyr • 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 730 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

AAGAACAAAT ACTATTAAGG CACTGCTTGC ATATATTAAA TGATGTCCAA ACTCCAAAAA 60
CTGTTAATAA TTAACACTCC AATAAAAACT ACACCAGAAT TTCTTTTTAT TTGCACCCTC 120
ATCAGGATTA CAGCTTTATC AGGACTGCAT CTTCTTCAGA AATGAATATT TCTCTTACAA 180
CGCAAAGAAA GAAAAATCAA AATAAATTTT CTGATTGAAA ATGTAAAAAG GCAAATATTT 240
TTACAGTTTT AACTTTAATT TTTTATTGAG GACCAACTGT TTGAAAAATT CTCATTAGTC 300
ATTCCTTTAA ATTATGTGTA TGTGAGAGAA AGACGTAAGA TGGTTAATTA TTTCAAATGA 360
TGCAGTATAA AGAAGGGGCA TTATCACGGC AGAAACGAAA AAAGATATTT GTAGCTGGAG 420
GTTTTTATAG TCTAACATAT GGTTGCTATT TGTTCTACAA ATCCTTTTGA ATAATTTAAT 480
ATAGAGATTT CGAATAGAAA ATAATACTTT AGATAGAAAT TAATGAGTTT ATTATAACCA 540
TATATTATAA TAATTTACTT AGGAATTCTC TTTGATAAGA AACAAATGAA CTGAATGCAA 600
TTTTCTCCAC AGACCATATA ACACTGCCTA TGTACCTCCT CCTACATGCC ATTGGTTAAC 660
CATCAGTCAG TTTGCAGGGG TGGGGGGAGG GGTTTCCTGC CCATTGTTTT TGTAATCTCT 720
GAGGAGAAGC 730

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 121 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
    (A) NAME/KEY: exon
    (B) LOCATION: 38..121
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

AGCAGCAAAC ATTTGCTAGT CAGACAAGTG ACAGGGAATG GATTCCAAAC AGCACTGTGT 60
AAAGCTAAAT GATGGCCACT TCATGCCTGT ATTGGGATTT GGCACCTATG CACCTCCAGA 120
G 121

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 252 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GTAAGAATAA TTCCTTTTAG TTTTCGGATT TCAAAAGAAT AAACCTAGTA GAAGTGAAAC 60
CCGTATTGGG TTGTAAGGTT CGTGTTCCTA CCTTACTCTG GATGACTCAC TGGTCTAGGT 120
TTCCTAGGCT AGGAGAAAAA AGTAGGCAAT CCTTGTTCTG CATTGAGGTC CATTCCTATG 180
GTCACGTACT GCTTATTTTT CGTTTGTGCA CTGTTTCTTT CTTCTGTTCA TGTCTAGTTC 240
CCAGCTTGGC AG 252

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 410 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GGAAGTCTGA GTGAGCATTC TGTGTAATAT CACTGGGAGA GAACTCATAT GAGCTTGCAC 60
CGTTTCCCTT CTATACTCCA TGTGATTTTT ACCATGTATA ATATCACTAT ATTAAAAATA 120
ATTAGGACTA TTTCAGTCAT GTTAACTTTT CCAACAAATC ACTGAATCTG AGGGTGTTAT 180
GTGGTACCTC CATAACAGTG ATCAACCAGA GATTGCCTGA GACTGAAGGT GTTTCTGGGA 240
TGCTCAACCT TTATTACTAA CCAGGAAAGA CTCAGGCAAA CTGAGATGGA CTTTTCACCC 300
CACATACAGA CAGGAGGAAA AGCTGATTCT TGTATAAAAG TCAATGCTTG TGCCTGAACT 360
ACCTCTCAGC CACAGTGATC ACCAGATACT ACCTTTGGTT GCTCCTCCAG 410

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 168 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
    (A) NAME/KEY: exon
    (B) LOCATION: 1..168
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GTTCCGAGAA GTAAAGCTTT GGAGGTCACA AAATTAGCAA TAGAAGCTGG GTTCCGCCAT 60
ATAGATTCTG CTCATTTATA CAATAATGAG GAGCAGCTTG GACTGGCCAT CCGAAGCAAG 120
ATTGCAGATG GCAGTGTGAA GAGAGAAGAC ATATTCTACA CTTCAAAG 168

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 700 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GTACTGTGTC TATGATGAGC TTGTGTGCAC ATGTATTTAT TGTGATTGTG TGGAGATGAC 60
AATTCTATGA CTGGATGAGT AGTTGTGGGT GAATTTTGCT TCTGGGTTCA AATTTATTCA 120
CACATACTCA CATACTAAAA CTGAAATCAA AATCAAGGAA TGATGATCAC TTTTCATTTT 180
GGCTGTGTTC CAATTTATGA CCTGAAAGTC CCTTTACTTT TTTGAGCTTC AGCCGAGATC 240
AGTGTGATTT GACATGTGCT ATAGAATCAC AGAGAACAAT AATCATGTTA TGGTTTTTCT 300
TATCGCCTGG GTGATTTTCT AAGATTTCTT ATTATTCTCT CAATTGCTAT CTTTATCAGT 360
GAGATAGAAA GCAATATAAG AAAGCTCTGG GAGTATTAAA TAATAGACAC TTAAATTGTC 420
CTAAATTGTG TCCAGCATAG TGAGCATGTT CAAAACTTGT TTTACCCCCC TTTTATGTTG 480
CTTTAGTTTC TAAGCAACAT AAATAGCTAT TCTTAAGCAT TGGGTTGAAT GGATAGAAGA 540
ATTAGACTGT TAAAATGAGT TGTAAACTCT ACTGAAGATA ATTCAGGTAA CATCATAGTT 600
ATTACTTAAT ACTAATCTTT ACATTTTAAG AATTTACTCC TATCATTCAG TAGATGTACA 660
AACTATACAT CCAACGTATA ATAAAGTTTA TAAGGATAGG 700

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 73 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ACTAGATGGC ACAAAGTAAT AAGATTTSCT CAAGCATTCA TTCAAAATCA CCTCCATTCT 60
TTAACCTCTG CAG 73

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 117 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
    (A) NAME/KEY: exon
    (B) LOCATION: 1..117
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

CTTTGGTCCA CTTTTCATCG ACCAGAGTTG GTCCGACCAG CCTTGGAAAA CTCACTGAAA 60
AAAGCTCAAT TGGACTATGT TGACCTCTAT CTTATTCATT CTCCAATGTC TCTAAAG 117
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 152 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GTATGCAGTT TGTATGAGCA TAAAATTGCG CTTCTGCTGT CATTATAAAC ATTGTTTATC 60
TGGATAGTTG AACAGAGCTT TTTATTAGGA GGATGTAGGG ATTATCACAC AGAAGAAGAA 120
CCGTAAGTGG AACACCTAAT TTCCTTTCTT TC 152 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 206 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

ATATAATATT TGTAAGAGAT TAGAGGAAGC CTGTCTCCTG AATACATTCC TTATACCTTC 60
ATATGTAAAA CACTTAGCAC ATATCACTTT CTGGAGCATT GTACCACCTG TCTCATGGAG 120
GATTAGTGTC CTTAAAGGTA CCTGGGGTTA CAGCTATGAG TGGAGAAATT AATTTGTGAC 180
ATCATTAAAA TGACTGCTTC TATTTCAG 206

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 78 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
    (A) NAME/KEY: exon
    (B) LOCATION: 1..78
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CCAGGTGAGG AACTTTCACC AACAGATGAA AATGGAAAAG TAATATTTGA CATAGTGGAT 60
CTCTGTACCA CCTGGGAG 78

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 98 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GTGAGTGCTT GGCGGAGAGG ACACAGAGAA GGATGACAAA AAGAGAAAAT CTGTTTCCCA 60
GGTTCGATAG GAAAGAATGG AATATGCACC ATTAGATC 98

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 249 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GACAGGAATC TCTTTCCTTG CTTGTGCATT AATCTATGCA GTTTCCTAAG GAAGAGATAG 60
AAATTCTTAC TCTTGCTGCC TCTATCTTCT TCCCCTATTT GCTGTTTGAA rTTTTCTTTT 120
TTTGACAATC ACTGCTAGCT ATTTTCATTG TCATACTTTG AAAGTTGTTG CTCTCACAGT 180
TCTGTCTTGC ATTTACCGTG ATTTGCAGCC AACTGCACAA ATAATTCCTC ACAACCCCTT 240
TCTCCACAG 249

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 123 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
    (A) NAME/KEY: exon
    (B) LOCATION: 1..123
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GCCATGGAGA AGTGTAAGGA TGCAGGATTG GCCAAGTCCA TTGGGGTGTC AAACTTCAAC 60
CGCAGGCAGC TGGAGATGAT CCTCAACAAG CCAGGACTCA AGTACAAGCC TGTCTGCAAC 120

CAG 123 

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 138 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GTGAGCTCCC TTGGCCTTCT CTCCTTTCGG TTCTTCATGC CCCCTCTTCC TGTCCTATTG 60
CCAAATATCT CTTTCTTTTG TCCCAGTTAT CTTTGTGAAG TAGAAGATTA TCTAGAGAGC 120
AAAGCTTCTG TCAAGAAA 138

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 189 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

ATTTCCATTT ATACTTTTAG AAGATATATA AAATTTATTT CTATGAAAAA GGTTATTACT 60
TGACAATAAT ATCCTCAGCT CAAATATAAT GCTATACTGA TTATTATTCA GCTTCCTTAC 120
TTTCATCTTT TCAATATTAA CATAACTATT TCATATAAAT TGATGCTTCT CTCTTTTGGT 180
CAACTGCAG 189

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 110 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
    (A) NAME/KEY: exon
    (B) LOCATION: 1..110
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GTAGAATGTC ATCCGTATTT CAACCGGAGT AAATTGCTAG ATTTCTGCAA GTCGAAAGAT 60
ATTGTTCTGG TTGCCTATAG TGCTCTGGGA TCTCAACGAG ACAAACGATG 110

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 136 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

GTAATAAAAA CAATGGGACC TTTACATAAA CCTTCATTTT GCAGAAAATT TTTTAGTCAG 60
AGCATCCTCA GTTTCCTGTA GTTAAGTTTC AAGTGGCTCA TGGAGAGGAA AGAGAATTGC 120
GTTTCTGACG AGATCT 136

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 66 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

TTTAGGGAGC TGCCTAACAA ACTATCGGCA GCCTCAGGGC CTCAGCCTTT CTGCCTTTCC 60
TTCCAG 66

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 166 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
    (A) NAME/KEY: exon
    (B) LOCATION: 1..166
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

GGTGGACCCG AACTCCCCGC TGCTCTTGGA GGACCCAGTC CTTTGTGCCT TGGCAAAAAA 60
GCACAAGCGA ACCCCAGCCC TGATTGCCCT GCGCTACCAG CTGCAGCGTG GGGTTGTGGT 120
CCTGGCCAAG AGCTACAATG AGCAGCGCAT CAGACAGAAC GTGCAG 166

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 136 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GTGAGGAGCG GGGCTGTGGG CCTCAGGTCT CCTGCACAGT GTCCTTCACA CGTGTGCTTC 60
TTGTAAGGCT CTCAGGACAG CCTTGGGCCA GCTCCATTTC CCTGTATTTC CCATATGAAT 120
GCTTTGCGTG CATCCT 136

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 286 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

:CCTATCATG TGGGCACAAT GTCAGCGCTG TTTCTTCTCC ATTTTCTGTT GAAATTTTCT 60
rrTTCTCTGC AGAGTTGCAC AGTTTCAATA CATAATATCT AGGAATGGAT TTCTGCTTAT 120
^TTTCGTGAG CTATTCATTG ACCCACCTGA GTGTTTAGAG CTGACTTCTA TAACTGTTTA 180
AAACTTACCA ATATTTTAAG TATTGTCTCT GCACCCTACT GTCTAATATA CTTGGGGATT 240
^ACAACTGGC AATCTAAAAA TAATAAAAGT TTTTTATTTC TGATAG 286

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 83 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
    (A) NAME/KEY: exon
    (B) LOCATION: 1..83
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

GTTTTTGAGT TCCAGTTGAC TGCAGAGGAC ATGAAAGCCA TAGATGGCCT AGACAGAAAT 60
CTCCACTATT TTAACAGTGA TAG 83

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 713 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GTAAGTTTCC TTTGTAAATG GGTGATCTAA TTTATTTCTG GAGAAGGAAT GTAGGATGGG 60
TGTTGAGAGT GACCTCCATA CCAGAGGGAC AGAGGCCAAT GTGAGTCAGA GGTGAGACTG 120
GAACTCTCCT GCTGGATTCA CTCCAGAGCT CTGTTCTCTG GCAGGGTGAG TGGGCAGGGA 180
TCAGCATGGG TCAACCTGTG CCTCTGCTCT CCTGACTCCA TGGAACTTTC CAGAGCAGCC 240
AACATCATTG CCAAGTCTGC ACGTTCCATA TAGGCCTGGT GTTTCTACCA CTGGACATGC 300
TGTGGATACT GCCCATGTGA CTTCATTAGA TGTTTCCAAA TCTGTGCTTA TATCACATTG 360
TCCCAAACCT GCTCAGCTCC TTATCAAATC AAAAACATTT CCATCAACTT TGTGGTCCAG 420
GTGCCAATTC CCACCTCCTT CATATGGAAT TGCTTGCTAG ATCCTGTCAA TTCAGCATCT 480
TTTATTATTT CAAATGTTTT TCCTCCTTCT CCTTGCACGT TTGTTCATGC CCCAAACTCT 540
GCTTTTGCCT CCAGAAAGCC TTCCTTAGTG GAGTGAATAG GAGTGCTTGT CCTTGATTTC 600
CTGCAATATG GAGCTCTCAA GGCAGAGAAT TTAAAAAAAT TTAAAATCAA GGAGTGTGAG 660
TGTGGAGGCA GAAGCTCCAT TGTTGTATAT AATTTGTAGC TGATAAAAGA TCT 713

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 415 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TTTAATGCAC TGTAGCTCCT TGGATATTAG ACCCTATATC ATATATAACA ATTTACATTT 60 

CTGAATCTTA CAAAATATAT TGCATACAGT AGGCAGTAGC AGGTAATAAG TAAAGTAACA 120 

AAAGAAAGTA TAATCAGAGT ATCTCTGCTC TGCTGACAGA TGTACAGGAA TATACTTGAA 180 

TATTTGACTT TGTGTGTTTT ACGTGTTAAC TTCCAGATAA GGGAATATGA TTGAATAATT 240

TATTATTTTG AAAATACTGT ATTATGAAGC CATGTTCATA AAGGTAAGAA AGGCAGATTC 300 

TACAACTAGT CAGACAACTT AACATTCATA CTAATGACAG CTTCATTGAA ATCACTTTAC 360 

TACTCCCCTA GTAATGGAGT CATTGCATTT ATATTATACA TTATTCTCTT TTCAG 415 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 230 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
    (A) NAME/KEY: exon
    (B) LOCATION: 1..230

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

TTTTGCTAGC CACCCTAATT ATCCATATTC AGATGAATAT TAACATGGAG GGCTTTGCCT 60 

GATGATGTCT ACCAGAAGGC CCTGTGTGTG GATGGTGACG CAGAGGACGT CTCTATGCCG 120 

GTGACTGGAC ATATCACCTC TACTTAAATC CGTCCTGTTT AGCGACTTCA GTCAACTACA 180

GCTGAGTCCA TAGGCCAGAA AGACAATAAA TTTTTATCAT TTTGAAATAA 230 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 109 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
TTGAATGTTT TCTCAAAGAT TCTTTACCTA CTCTGTTCTG TAGTGTGTGT TTTCTTCTGG 60 
CTCAGAAGTG TGTGTGTGTG TGTGTGTGCT TTCTTCTGGC TCAACAGGG 109 
(2) INFORMATION FOR SEQ ID NO: 30: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
TTTAGCTTTA CACACTGCTG TT 22 
(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
TCCAAAGCTT TACTTCTCGG 20
(2) INFORMATION FOR SEQ ID NO: 32: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
GATGAAAAGT GGACCA 16
(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
ATCTGTTGGT GAAAGTTC 18
(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
TCCAGCTGCC TGCGGT 16

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
CTTGTACTTG AGTCCTG 17

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
CTCCGGTTGA AATACGGA 18

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
CATCGTTTGT CTCGTTGAGA 20

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
TCACTGTTAA AATAGTGGAG AT 22
(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
ATCTGAATAT GGATAAT 17
(2) INFORMATION FOR SEQ ID NO: 40: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
TTCTCGGAAC CTGGAGGAGC 20
(2) INFORMATION FOR SEQ ID NO: 41: 

40 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
GACACAGTAC CTTTGAAGTG 20

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
TGGACCAAAG CTGCAGAGGT 20
(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
CCTCACCTGG CTGAAATAGA 20
(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
AAGCACTCAC CTCCCAGGTG 20
(2) INFORMATION FOR SEQ ID NO: 45: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
GACATTCTAC CTGCAGTTGA 20

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
CTCAAAAACC TATCAGAAA 19

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
GGAAACTTAC CTATCACTGT 20

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
GCTAGCAAAA CTGAAAAGAG 20

(2) INFORMATION FOR SEQ ID NO:49: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
GAGAAATATT CATTCTG 17

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
CGAGTCCTGA TAAAGCTG 18

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
GATGAGGGTG CAAATAA 17



SUBSTITUTE SHEET (RULE 26) 



WO 97/11162 



PCT/CA96/00605 






(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
GGAGTGTTAA TTAATAACAG TTT 23
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
CAGAGATTAC AAAAACAAT 19

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
TGCCTTTTTA CATTTTCAAT CA 22
(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
ACACATAATT TAAAGGA 17
(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:
TTAAATTATT CAAAAGG 17
(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
AAGAGAAATA TTCATTTCTG 20
(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
CCCCTCCCCC CACCCCTGCA 20
(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
CTGCCGTGAT AATGCCCC 18

CLAIMS
We claim: 



1. An isolated nucleotide sequence encoding type 5 17β-hydroxysteroid
dehydrogenase, said sequence being sufficiently homologous to SEQ ID No. 1 or a
complement thereof, to hybridize under stringent conditions to the coding region of
SEQ ID No. 1 or a complement thereof and said sequence encoding an enzyme which
catalyzes the conversion of progesterone to 20α-hydroxyprogesterone and the
conversion of 4-androstenedione to testosterone.



2. The nucleotide sequence, as recited in claim 1, wherein said sequence is the
coding region of SEQ ID No. 1.

3. A recombinant expression vector comprising a promoter sequence and a
nucleotide sequence in accordance with claim 1.

4. A recombinant expression vector comprising a promoter sequence and a
nucleotide sequence in accordance with claim 2.

5. A recombinant host cell, transformed or transfected with the vector of claim 4.

6. The recombinant host cell of claim 5, wherein said host cell is a eukaryotic
cell.

7. A recombinant host cell, transformed or transfected with the vector of claim 3.

8. The recombinant host cell of claim 7, wherein said host cell is a eukaryotic
cell.

9. The recombinant host cell of claim 8, wherein a nucleotide sequence that
hybridizes under stringent conditions with SEQ ID No. 1 or its complement is
integrated into the genome of said host cell.




10. The recombinant host cell of claim 9, wherein said nucleotide sequence is 
located on a recombinant vector. 

11. The recombinant host cell, as recited in claim 8, wherein said host cell is
capable of expressing a biologically active type 5 17β-hydroxysteroid dehydrogenase.

12. An isolated nucleotide sequence comprising at least ten consecutive nucleotides
identical to 10 consecutive nucleotides in the coding region of SEQ ID No. 1, or the
complement thereof.

13. The nucleotide sequence, as recited in claim 12, wherein said sequence
comprises at least fifteen consecutive nucleotides identical to 15 consecutive
nucleotides in the coding region of SEQ ID No. 1, or the complement thereof.

14. The nucleotide sequence, as recited in claim 13, wherein said sequence
comprises at least twenty consecutive nucleotides identical to 20 consecutive
nucleotides in the coding region of SEQ ID No. 1, or the complement thereof.

15. The nucleotide sequence, as recited in claim 13, wherein said sequence
comprises at least thirty consecutive nucleotides identical to 30 consecutive nucleotides
in the coding region of SEQ ID No. 1, or the complement thereof.
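
By way of illustration only, the "at least N consecutive nucleotides identical" test recited in claims 12 to 15 can be sketched as a sliding-window comparison. The sketch below assumes that "the complement thereof" means the reverse-complement strand, and it uses a short hypothetical reference string (`TOY_REFERENCE`) that is not the coding region of SEQ ID No. 1; the function names are likewise illustrative and form no part of the claims.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def shares_consecutive_run(candidate: str, reference: str, k: int) -> bool:
    """True if `candidate` contains k consecutive bases identical to k
    consecutive bases of `reference` or of its reverse complement."""
    candidate, reference = candidate.upper(), reference.upper()
    if k <= 0 or k > len(candidate) or k > len(reference):
        return False
    # Collect every k-base window from both strands of the reference.
    windows = set()
    for strand in (reference, reverse_complement(reference)):
        for i in range(len(strand) - k + 1):
            windows.add(strand[i:i + k])
    # The candidate qualifies if any of its own k-base windows matches.
    return any(candidate[i:i + k] in windows
               for i in range(len(candidate) - k + 1))


# Hypothetical 20-base stand-in; NOT the coding region of SEQ ID No. 1.
TOY_REFERENCE = "ATGGATTCCAAACAGCAGTG"
print(shares_consecutive_run("TTCCAAACAG", TOY_REFERENCE, 10))
```

Raising `k` from 10 to 15, 20 or 30 corresponds to the progressively narrower thresholds of claims 13, 14 and 15.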

16. An oligonucleotide sequence selected from the group consisting of
TTTAGCTTTACACACTGCTGTT (SEQ ID No. 30),
TCCAAAGCTTTACTTCTCGG (SEQ ID No. 31), GATGAAAAGTGGACCA
(SEQ ID No. 32), ATCTGTTGGTGAAAGTTC (SEQ ID No. 33),
TCCAGCTGCCTGCGGT (SEQ ID No. 34), CTTGTACTTGAGTCCTG (SEQ ID
No. 35), CTCCGGTTGAAATACGGA (SEQ ID No. 36),
CATCGTTTGTCTCGTTGAGA (SEQ ID No. 37),
TCACTGTTAAAATAGTGGAGAT (SEQ ID No. 38), and
ATCTGAATATGGATAAT (SEQ ID No. 39).

17. An oligonucleotide sequence selected from the group consisting of
TTCTCGGAACCTGGAGGAGC (SEQ ID No. 40),
GACACAGTACCTTTGAAGTG (SEQ ID No. 41),
TGGACCAAAGCTGCAGAGGT (SEQ ID No. 42),
CCTCACCTGGCTGAAATAGA (SEQ ID No. 43),
AAGCACTCACCTCCCAGGTG (SEQ ID No. 44),
GACATTCTACCTGCAGTTGA (SEQ ID No. 45), CTCAAAAACCTATCAGAAA
(SEQ ID No. 46), GGAAACTTACCTATCACTGT (SEQ ID No. 47), and
GCTAGCAAAACTGAAAAGAG (SEQ ID No. 48).


18. An oligonucleotide sequence selected from the group consisting of
GAGAAATATTCATTCTG (SEQ ID No. 49), CGAGTCCTGATAAAGCTG (SEQ
ID No. 50), GATGAGGGTGCAAATAA (SEQ ID No. 51),
GGAGTGTTAATTAATAACAGTTT (SEQ ID No. 52),
CAGAGATTACAAAAACAAT (SEQ ID No. 53),
TGCCTTTTTACATTTTCAATCA (SEQ ID No. 54), ACACATAATTTAAAGGA
(SEQ ID No. 55), TTAAATTATTCAAAAGG (SEQ ID No. 56),
AAGAGAAATATTCATTTCTG (SEQ ID No. 57),
CCCCTCCCCCCACCCCTGCA (SEQ ID No. 58), and
CTGCCGTGATAATGCCCC (SEQ ID No. 59).



19. A recombinant expression vector comprising:
a promoter sequence; and

an oligonucleotide sequence selected from the group consisting of
TTTAGCTTTACACACTGCTGTT (SEQ ID No. 30),
TCCAAAGCTTTACTTCTCGG (SEQ ID No. 31), GATGAAAAGTGGACCA
(SEQ ID No. 32), ATCTGTTGGTGAAAGTTC (SEQ ID No. 33),
TCCAGCTGCCTGCGGT (SEQ ID No. 34), CTTGTACTTGAGTCCTG (SEQ ID
No. 35), CTCCGGTTGAAATACGGA (SEQ ID No. 36),
CATCGTTTGTCTCGTTGAGA (SEQ ID No. 37),
TCACTGTTAAAATAGTGGAGAT (SEQ ID No. 38), and
ATCTGAATATGGATAAT (SEQ ID No. 39).

20. A recombinant expression vector comprising:
a promoter sequence; and

an oligonucleotide sequence selected from the group consisting of
TTCTCGGAACCTGGAGGAGC (SEQ ID No. 40),
GACACAGTACCTTTGAAGTG (SEQ ID No. 41),
TGGACCAAAGCTGCAGAGGT (SEQ ID No. 42),
CCTCACCTGGCTGAAATAGA (SEQ ID No. 43),
AAGCACTCACCTCCCAGGTG (SEQ ID No. 44),
GACATTCTACCTGCAGTTGA (SEQ ID No. 45), CTCAAAAACCTATCAGAAA
(SEQ ID No. 46), GGAAACTTACCTATCACTGT (SEQ ID No. 47), and
GCTAGCAAAACTGAAAAGAG (SEQ ID No. 48).

21. A recombinant expression vector comprising:
a promoter sequence; and

an oligonucleotide sequence selected from the group consisting of
GAGAAATATTCATTCTG (SEQ ID No. 49), CGAGTCCTGATAAAGCTG (SEQ
ID No. 50), GATGAGGGTGCAAATAA (SEQ ID No. 51),
GGAGTGTTAATTAATAACAGTTT (SEQ ID No. 52),
CAGAGATTACAAAAACAAT (SEQ ID No. 53),
TGCCTTTTTACATTTTCAATCA (SEQ ID No. 54), ACACATAATTTAAAGGA
(SEQ ID No. 55), TTAAATTATTCAAAAGG (SEQ ID No. 56),
AAGAGAAATATTCATTTCTG (SEQ ID No. 57),
CCCCTCCCCCCACCCCTGCA (SEQ ID No. 58), and
CTGCCGTGATAATGCCCC (SEQ ID No. 59).

22. A method of blocking synthesis of type 5 17β-HSD, comprising the step of:
introducing an oligonucleotide selected from the group consisting of
TTTAGCTTTACACACTGCTGTT (SEQ ID No. 30),
TCCAAAGCTTTACTTCTCGG (SEQ ID No. 31), GATGAAAAGTGGACCA
(SEQ ID No. 32), ATCTGTTGGTGAAAGTTC (SEQ ID No. 33),
TCCAGCTGCCTGCGGT (SEQ ID No. 34), CTTGTACTTGAGTCCTG (SEQ ID
No. 35), CTCCGGTTGAAATACGGA (SEQ ID No. 36),
CATCGTTTGTCTCGTTGAGA (SEQ ID No. 37),
TCACTGTTAAAATAGTGGAGAT (SEQ ID No. 38), and
ATCTGAATATGGATAAT (SEQ ID No. 39) into cells.

23. A method of blocking synthesis of type 5 17β-HSD, comprising the step of:
introducing an oligonucleotide selected from the group consisting of
TTCTCGGAACCTGGAGGAGC (SEQ ID No. 40),
GACACAGTACCTTTGAAGTG (SEQ ID No. 41),
TGGACCAAAGCTGCAGAGGT (SEQ ID No. 42),
CCTCACCTGGCTGAAATAGA (SEQ ID No. 43),
AAGCACTCACCTCCCAGGTG (SEQ ID No. 44),
GACATTCTACCTGCAGTTGA (SEQ ID No. 45), CTCAAAAACCTATCAGAAA
(SEQ ID No. 46), GGAAACTTACCTATCACTGT (SEQ ID No. 47), and
GCTAGCAAAACTGAAAAGAG (SEQ ID No. 48) into cells.

24. A method of blocking synthesis of type 5 17β-HSD, comprising the step of:
introducing an oligonucleotide selected from the group consisting of
GAGAAATATTCATTCTG (SEQ ID No. 49), CGAGTCCTGATAAAGCTG (SEQ
ID No. 50), GATGAGGGTGCAAATAA (SEQ ID No. 51),
GGAGTGTTAATTAATAACAGTTT (SEQ ID No. 52),
CAGAGATTACAAAAACAAT (SEQ ID No. 53),
TGCCTTTTTACATTTTCAATCA (SEQ ID No. 54), ACACATAATTTAAAGGA
(SEQ ID No. 55), TTAAATTATTCAAAAGG (SEQ ID No. 56),
AAGAGAAATATTCATTTCTG (SEQ ID No. 57),
CCCCTCCCCCCACCCCTGCA (SEQ ID No. 58), and
CTGCCGTGATAATGCCCC (SEQ ID No. 59) into cells.
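
As a purely illustrative aid, the relationship between an antisense oligonucleotide such as those recited in claims 22 to 24 and the sense-strand site it would pair with can be sketched as a reverse-complement calculation. The function names below are hypothetical, and the sketch assumes simple Watson-Crick pairing over the full length of the oligonucleotide; it does not confirm where, or whether, any listed oligonucleotide binds within the type 5 17β-HSD transcript.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string; rejects non-ACGT characters."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def target_mrna_site(antisense_dna: str) -> str:
    """Sense-strand RNA site (5'->3') that would pair with the oligonucleotide."""
    return reverse_complement(antisense_dna).replace("T", "U")


# SEQ ID No. 30 as recited in the claims; the printed site is derived, not quoted.
SEQ_ID_30 = "TTTAGCTTTACACACTGCTGTT"
print(target_mrna_site(SEQ_ID_30))
```

Applying `reverse_complement` twice returns the original oligonucleotide, which gives a quick consistency check on any transcribed sequence.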

25. An isolated chromosomal DNA fragment which upon transcription and
translation encodes type 5 17β-hydroxysteroid dehydrogenase and wherein said
fragment contains nine exons and wherein said fragment includes introns which are 16
kilobase pairs in length.

26. An isolated DNA sequence encoding type 5 17β-hydroxysteroid
dehydrogenase, said sequence being sufficiently homologous to SEQ ID No. 3 or a
complement thereof, to hybridize under stringent conditions to SEQ ID No. 3, or its
complement.

27. A method for producing type 5 17β-hydroxysteroid dehydrogenase, comprising
the steps of:

preparing a recombinant host transformed or transfected with the vector of
claim 3; and

culturing said host under conditions which are conducive to the production of
type 5 17β-hydroxysteroid dehydrogenase by said host.



28. A method for determining the inhibitory effect of a test compound on the
enzymatic activity of type 5 17β-hydroxysteroid dehydrogenase, comprising the steps
of:

providing type 5 17β-hydroxysteroid dehydrogenase;

contacting said type 5 17β-hydroxysteroid dehydrogenase with said test
compound; and thereafter

determining the enzymatic activity of said type 5 17β-hydroxysteroid
dehydrogenase in the presence of said test compound.

29. The method, as recited in claim 28, wherein said step of determining enzymatic
activity includes the steps of:

adding a substrate which is metabolized by said type 5 17β-hydroxysteroid
dehydrogenase; and

determining an amount of said substrate which is converted to metabolite.
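
For illustration only, the measurement recited in claims 28 and 29 can be reduced to two simple ratios. The sketch below assumes that substrate and metabolite are quantified in the same units and that a control incubation without the test compound is available for comparison; neither assumption, nor the function names, is taken from the claims.

```python
def percent_conversion(substrate_remaining: float, metabolite_formed: float) -> float:
    """Percentage of recovered substrate that was converted to metabolite."""
    total = substrate_remaining + metabolite_formed
    if total <= 0:
        raise ValueError("total recovered material must be positive")
    return 100.0 * metabolite_formed / total


def percent_inhibition(conversion_control: float, conversion_with_compound: float) -> float:
    """Inhibition relative to a control incubation run without the test compound."""
    if conversion_control <= 0:
        raise ValueError("control conversion must be positive")
    return 100.0 * (1.0 - conversion_with_compound / conversion_control)


# Hypothetical readings in arbitrary units, not data from the specification.
control = percent_conversion(substrate_remaining=60.0, metabolite_formed=40.0)
treated = percent_conversion(substrate_remaining=90.0, metabolite_formed=10.0)
print(percent_inhibition(control, treated))
```

Expressing inhibition relative to a control makes results comparable between test compounds assayed at different baseline enzyme activities.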






30. A method of interfering with the expression of type 5 17β-hydroxysteroid
dehydrogenase, comprising the step of administering nucleic acids substantially
identical to at least 15 consecutive nucleotides of SEQ ID No. 1 or a complement
thereof.

31. A method of interfering with the synthesis of type 5 17β-hydroxysteroid
dehydrogenase, comprising the step of administering antisense RNA complementary
to mRNA encoded by at least 15 consecutive nucleotides of SEQ ID No. 1 or a
complement thereof.

32. A method of interfering with the expression of type 5 17β-hydroxysteroid
dehydrogenase, comprising the step of administering nucleic acids substantially
identical to at least 15 consecutive nucleotides of SEQ ID No. 3 or a complement
thereof.

33. A method of interfering with the synthesis of type 5 17β-hydroxysteroid
dehydrogenase, comprising the step of administering antisense RNA complementary
to mRNA encoded by at least 15 consecutive nucleotides of SEQ ID No. 3 or a
complement thereof.

34. A method for determining the inhibitory effect of antisense nucleic acids on the
enzymatic activity of type 5 17β-hydroxysteroid dehydrogenase, comprising the steps
of:

providing a host system capable of expressing type 5 17β-hydroxysteroid
dehydrogenase;

introducing said antisense nucleic acids into said host system; and thereafter

determining the enzymatic activity of said type 5 17β-hydroxysteroid
dehydrogenase.